
Yao Hu

Alibaba Group

EvoCut: Multi-Layer Evolution-Aware Visual Token Compression for Efficient Large Vision-Language Models

Jun 01, 2026
Via arXiv

Deep Research as Rubric for Reinforcement Learning

May 31, 2026
Via arXiv

Preference-Aware Rubric Learning for Personalized Evaluation

May 29, 2026
Via arXiv

UniNote: A Unified Embedding Model for Multimodal Representation and Ranking

May 28, 2026
Via arXiv

AgentCVR: Active Multi-Agent Cross-Video Reasoning via Script-Simulated Reinforcement Learning

May 28, 2026
Via arXiv

UnityMAS-O: A General RL Optimization Framework for LLM-Based Multi-Agent Systems

May 26, 2026
Via arXiv

Tournament-GRPO: Group-Wise Tournament Rewards for Reinforcement Learning in Open-Ended Long-Form Generation

May 26, 2026
Via arXiv

Share More, Search Less: Collaborative Parallel Thinking for Efficient Test-Time Scaling

May 26, 2026
Via arXiv

Beyond Literal Translation: Evaluating Cultural Effectiveness in Social Media UGC

May 25, 2026
Via arXiv

Evolving-RL: End-to-End Optimization of Experience-Driven Self-Evolving Capability within Agents

May 11, 2026
Via arXiv